Abstract. The notion of distillable entanglement is one of the fundamental
concepts of quantum information theory. Unfortunately, there is an apparent mismatch
between the intuitive and rigorous definitions of distillable entanglement. To be
precise, the existing rigorous definitions impose the constraint that the distillation
protocol produce an output of constant dimension. It is therefore conceivable that
this unnecessary constraint might have led to underestimation of the true distillable
entanglement. We give a new definition of distillable entanglement which removes this
constraint, but could conceivably overestimate the true value. Since the definitions
turn out to be equivalent, neither underestimation nor overestimation is possible,
and both definitions are arguably correct.



Since the concept of distillable entanglement is such a fundamental part of quan- 
tum information theory, it is unfortunate that a gap currently exists between its 
intuitive and rigorous definitions. 

Intuitively, the distillable entanglement of a state $\rho$ is the maximum over all
allowable protocols of the expected rate at which "good" EPR pairs can be obtained
from a sequence of identical states. For instance, if we have a protocol which, given
10 copies of a state $\rho$, produces 10 "good" EPR pairs half the time, and fails the
other half, then we would consider the distillable entanglement of $\rho$ to be at least
1/2. Unfortunately, it is not entirely obvious how to make this rigorous; in particular,
it is unclear how one should take into account imperfect output when the output dimension
can vary. For this reason, rigorous definitions [1] of distillable entanglement have
so far only permitted protocols which always produce the same sort of output; by
these definitions, we would only be justified in claiming that $\rho$ has distillable
entanglement at least 1/2 if the above protocol produced 5 "good" EPR pairs all
the time, rather than 10 only half the time. Consequently, these definitions could
conceivably have underestimated the "true" distillable entanglement.

The purpose of the present note is to argue that this is not the case, by
giving two new rigorous definitions of distillable entanglement which arguably would
overestimate the intuitive distillable entanglement, and then showing that the new
definitions agree with the existing definitions.

Classes of superoperators

The concept of distillable entanglement is not quite intrinsic to a state; rather, 
the distillable entanglement can only be defined relative to some specified class of 
legal operations. It will be necessary, therefore, for us to describe which such classes 
we will be considering. 

Recall that any physical operation can be described by a "completely positive
trace-preserving superoperator" [2], that is, an operator $S$ acting linearly on
Hermitian matrices such that $1 \otimes S$ takes density operators to density operators.
Any such operator can be written in the form

\[ \rho \mapsto \sum_i S_i \rho S_i^\dagger = S(\rho), \]

with

\[ \sum_i S_i^\dagger S_i = 1. \]
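
The operator-sum form above can be checked numerically on a small example. The following is a minimal sketch, assuming a single-qubit amplitude-damping map as the illustration; the map, the strength `g`, and all helper names are hypothetical choices made here and are not taken from the text.

```python
import math

def dagger(m):
    """Conjugate transpose of a matrix stored as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[r][c]).conjugate() for r in range(rows)] for c in range(cols)]

def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matsum(mats):
    """Entrywise sum of equally shaped matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[r][c] for m in mats) for c in range(cols)] for r in range(rows)]

def apply_superoperator(kraus, rho):
    """Compute rho -> sum_i S_i rho S_i^dagger."""
    return matsum([matmul(matmul(s, rho), dagger(s)) for s in kraus])

def completeness(kraus):
    """Compute sum_i S_i^dagger S_i, which should be the identity."""
    return matsum([matmul(dagger(s), s) for s in kraus])

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

# Hypothetical example: amplitude damping on one qubit with strength g.
g = 0.25
kraus = [
    [[1.0, 0.0], [0.0, math.sqrt(1.0 - g)]],
    [[0.0, math.sqrt(g)], [0.0, 0.0]],
]
rho = [[0.0, 0.0], [0.0, 1.0]]         # the pure state |1><1|
out = apply_superoperator(kraus, rho)  # approximately diag(g, 1 - g)
```

The completeness check is what makes the map trace-preserving; dropping either operator from `kraus` leaves a completely positive map that is no longer trace-preserving.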

In practice, it is helpful to allow operations which are partially classical, that is,
"measurements". This corresponds to a decomposition of $S$ as a sum $S = \sum_i S_i$ in
which each $S_i$ is a completely positive, but not trace-preserving, superoperator
mapping to a Hilbert space $V_i$. To be precise, each $S_i$ is of the form

\[ \rho \mapsto \sum_j S_{ij} \rho S_{ij}^\dagger, \]

where each $S_{ij}$ has image in $V_i$, satisfying the condition

\[ \sum_i \sum_j S_{ij}^\dagger S_{ij} = 1. \]

The key points are that the spaces $V_i$ need not be the same, and that the operation
also produces classical information indicating which $S_i$ is applied. These will be
the basic operations allowed in the sequel, and will be referred to simply as
"operations". An operation consisting of more than one superoperator will be said to be
"measuring".

There is a natural notion of composition on operations; given an operation $S$ on
a Hilbert space $V$ and an operation $T$ on the output space $V_i$ of $S$, one can compose
$S$ and $T$ in the obvious way (perform $S$; if $S_i$ was performed, then perform $T$).
One can also take tensor products of operations; if $S = \{S_i\}$ and $T = \{T_j\}$, then
we define

\[ S \otimes T = \{S_i \otimes T_j\}. \]

Finally, if $S$ is an operation such that $S_i$ and $S_j$ have the same output space, one
can produce a new operation that "forgets" which of $i$ or $j$ occurred.

Definition. A "class" of operations is a set of operations containing the identity 
and closed under all of the transformations of the above paragraph. 

On a bipartite Hilbert space $V_A \otimes V_B$, there are five natural classes that have
been considered in the literature:

• Local operations. This is the class of operations of the form

\[ S_A \otimes S_B, \]

where $S_A$ is a non-measuring operation on $V_A$ and $S_B$ is a non-measuring
operation on $V_B$.

• 1-local operations (local operations plus one-way classical communication). This
is the class generated by the local operations together with all operations of the
form

\[ S_A \otimes 1, \]

where $S_A$ is an arbitrary operation on $V_A$. (Here, the classical communication
is from A to B.)

• 2-local operations. This is the class generated by local operations,

\[ S_A \otimes 1 \]

for operations $S_A$, and

\[ 1 \otimes S_B \]

for operations $S_B$.

• Separable operations. [3,4,5] This is the class of operations $S$ such that each
suboperation $S_i$ of $S$ is separable; that is, we can write

\[ S_i(\rho) = \sum_j (A_j \otimes B_j) \rho (A_j \otimes B_j)^\dagger \]

for operators $A_j$ and $B_j$ on $V_A$ and $V_B$ respectively.

• Positive-partial-transpose (p.p.t.) operations. [6] This is the class of operations
$S$ such that each suboperation $S_i$ has completely positive partial transpose; that
is, the superoperator

\[ S_i^\Gamma : \rho \mapsto S_i(\rho^\Gamma)^\Gamma \]

is completely positive, where $\Gamma$ is the partial transpose [7].

The first three are classes by definition, and the last two are easily verified to be 
classes. It is also not too hard to verify that each class in our list is contained in the 
next. In fact, the containment is strict in each case. An example of a separable but 
not 2-local operation is given in [5], while the creation of a p.p.t. but not separable 
state (see, e.g. [8]) is an inseparable, but p.p.t., operation; the other cases are 
trivial. 

Definitions of distillable entanglement 

Associated to any class $C$ containing the class of local operations is a notion of
distillable entanglement. As we have said, the $C$-distillable entanglement of a state
$\rho$ is intuitively defined as the rate at which "good" EPR pairs can be produced
from copies of $\rho$ using only operations from $C$. However, as stated this is not a
rigorous definition.

The prototype of our definitions of distillable entanglement is

"Definition". The $C$-distillable entanglement of a state $\rho$ on $V_A \otimes V_B$ is the
maximum number $D_C(\rho)$ such that there exists a sequence of operations $\mathcal{T}_i$ from $C$,
where $\mathcal{T}_i$ takes input $(V_A \otimes V_B)^{\otimes n_i}$, with outputs of the form
$V_{ij} \otimes V_{ij}$, and such that, as $i$ tends to $\infty$, we have the limits $n_i \to \infty$,

\[ \frac{1}{n_i} \sum_j p_{ij} \log_2 \dim V_{ij} \to D_C(\rho), \]

and the output of $\mathcal{T}_i(\rho^{\otimes n_i})$ is "good". Here $p_{ij}$ is the probability that
the suboperation $\mathcal{T}_{ij}$ is performed given the input state $\rho^{\otimes n_i}$.

To define "good", we will use the notion of fidelity. To any Hilbert space $V$ with
chosen basis $v_i$, we can associate a maximally entangled state

\[ \Phi^+(V) = \frac{1}{\sqrt{\dim V}} \sum_i v_i \otimes v_i. \]

This choice of maximally entangled state is by no means canonical; however, since
any two maximally entangled states of the same dimension are equivalent under
local unitary operators, the definitions below do not depend on the particular choice
of $\Phi^+(V)$. Given this convention, the fidelity of a state $\rho$ on $V \otimes V$ is defined by

\[ F(\rho) = \Phi^+(V)^\dagger \rho \, \Phi^+(V). \]
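
As an illustration of this fidelity, the sketch below builds $\Phi^+(V)$ in the standard basis and evaluates $F(\rho)$ for a state of the form $a\,\Phi^+\Phi^{+\dagger} + (1-a)/K^2$, reading the scalar term as a multiple of the identity. The function names and the choice of test state are assumptions made for this example; the vectors are real, so no complex conjugation is needed.

```python
import math

def phi_plus(k):
    """Maximally entangled vector (1/sqrt(k)) * sum_i v_i (x) v_i, as a flat list."""
    amp = 1.0 / math.sqrt(k)
    return [amp if idx // k == idx % k else 0.0 for idx in range(k * k)]

def isotropic_state(k, a):
    """Density matrix a * Phi Phi^dagger + ((1 - a) / k^2) * identity."""
    phi = phi_plus(k)
    n = k * k
    noise = (1.0 - a) / n
    return [[a * phi[r] * phi[c] + (noise if r == c else 0.0) for c in range(n)]
            for r in range(n)]

def fidelity(rho, k):
    """F(rho) = Phi^dagger rho Phi for a state rho on a k x k bipartite space."""
    phi = phi_plus(k)
    n = k * k
    return sum(phi[r] * rho[r][c] * phi[c] for r in range(n) for c in range(n))

# For this family the fidelity works out to a + (1 - a) / k^2.
f_example = fidelity(isotropic_state(3, 0.5), 3)
```

In particular, the completely random state (`a = 0`) has fidelity $1/K^2$ and the maximally entangled state (`a = 1`) has fidelity 1.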

Associated to any $\mathcal{T}_i$ from our prototypical definition, then, is the sequence of
fidelities $F_{ij}$ of $\mathcal{T}_{ij}(\rho^{\otimes n_i})$. Our main claim, then, is that if we insist
that the notion of "good" should depend only on the sequences $p_{ij}$, $F_{ij}$, and
$\dim V_{ij}$, and the class $C$ contains the class of 1-local operations, then there is a
unique notion of $C$-distillable entanglement.

One definition given in the literature [1] is 

Definition 1. The $C$-distillable entanglement of a state $\rho$ on $V_A \otimes V_B$ is the
maximum number $D_C(\rho)$ such that there exists a sequence of non-measuring operations
$\mathcal{T}_i$ from $C$, where $\mathcal{T}_i$ takes input $(V_A \otimes V_B)^{\otimes n_i}$, with output of the
form $V_i \otimes V_i$, and such that as $i$ tends to $\infty$, we have the limits $n_i \to \infty$,

\[ \frac{1}{n_i} \log_2 \dim V_i \to D_C(\rho) \]

and

\[ F_i \to 1. \]

Strictly speaking, the authors of [1] made the additional restriction that $\dim V_i$
should be a power of 2 for all $i$; we will call the resulting definition definition 1'.
However, by Theorem 2 below, these definitions are equivalent.

It is clear that a sequence of operations satisfying definition 1' can indeed be said
to have distilled "good" EPR pairs. However, this definition is stronger than one
would like, intuitively; as evidence of this, note that the "recurrence" protocol of [1]
does not directly meet this definition. The correct definition, therefore, should allow
the operations $\mathcal{T}_i$ to be measuring. The problem here is that it is not immediately
clear how the condition $F_i \to 1$ should be generalized.

One possibility is as follows. If a protocol distills entanglement at a given rate,
then it should certainly be the case that the entanglement of formation of the output
of the protocol increases at that rate. If we define $E_f(F, K)$ to be the minimum
entanglement of formation of a bipartite state of dimension $K \times K$ and fidelity $F$,
then this suggests

Definition 2. The $C$-distillable entanglement of a state $\rho$ on $V_A \otimes V_B$ is the
maximum number $D_C(\rho)$ such that there exists a sequence of operations $\mathcal{T}_i$ from $C$,
where $\mathcal{T}_i$ takes input $(V_A \otimes V_B)^{\otimes n_i}$, with outputs of the form
$V_{ij} \otimes V_{ij}$, and such that as $i$ tends to $\infty$, we have the limits $n_i \to \infty$,

\[ \frac{1}{n_i} \sum_j p_{ij} \log_2 \dim V_{ij} \to D_C(\rho), \]

and

\[ \frac{1}{n_i} \sum_j p_{ij} E_f(F_{ij}, \dim V_{ij}) \to D_C(\rho). \]

Remark. Equivalently, the fidelity condition can be stated as

\[ \frac{1}{n_i} \sum_j p_{ij} \left( \log_2 \dim V_{ij} - E_f(F_{ij}, \dim V_{ij}) \right) \to 0. \]

One possible objection to definition 2 is that it does not seem to allow the 
possibility of protocols which sometimes fail to produce any result. This is only 
apparently a problem; failure can be modelled by the production of a state of 
dimension 1 (and thus fidelity 1 and entanglement of formation 0). 

Definition 2, if anything, has the problem of being too weak, since entanglement
of formation is a rather large measure of entanglement. Since this definition is
equivalent to the too-strong definition 1 (by theorem 3 below), this argues that this
is, indeed, the "right" notion of distillable entanglement.

In practice, definition 2 is difficult to work with; it will be convenient, therefore, 
to introduce yet another definition, 

Definition 2'. The $C$-distillable entanglement of a state $\rho$ on $V_A \otimes V_B$ is the
maximum number $D_C(\rho)$ such that there exists a sequence of operations $\mathcal{T}_i$ from $C$,
where $\mathcal{T}_i$ takes input $(V_A \otimes V_B)^{\otimes n_i}$, with outputs of the form
$V_{ij} \otimes V_{ij}$, and such that as $i$ tends to $\infty$, we have the limits $n_i \to \infty$,

\[ \frac{1}{n_i} \sum_j p_{ij} \log_2 \dim V_{ij} \to D_C(\rho), \]

and

\[ \frac{1}{n_i} \sum_j p_{ij} (1 - F_{ij}) \log_2 \dim V_{ij} \to 0. \]
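
The two quantities in definition 2' are straightforward to evaluate for a single operation in the sequence. The sketch below is only an illustration: the function name and the representation of the outcomes as `(p, F, dim)` triples are assumptions made here. It is applied to the example from the introduction (10 copies in, 10 perfect EPR pairs half the time), with failure modelled as an output of dimension 1 and fidelity 1.

```python
import math

def rate_and_defect(n, outcomes):
    """Return the two quantities of definition 2' for one operation.

    n        -- number of input copies n_i
    outcomes -- list of (p, F, dim) triples: probability p_ij, fidelity F_ij,
                and output dimension dim V_ij of each suboperation
    """
    rate = sum(p * math.log2(dim) for p, _, dim in outcomes) / n
    defect = sum(p * (1.0 - f) * math.log2(dim) for p, f, dim in outcomes) / n
    return rate, defect

# Introductory example: 10 perfect EPR pairs (dimension 2**10) with
# probability 1/2, and failure (dimension 1, fidelity 1) otherwise.
intro_example = rate_and_defect(10, [(0.5, 1.0, 2 ** 10), (0.5, 1.0, 1)])
```

For this example the rate is 1/2 and the fidelity defect is 0, matching the intuitive claim that such a protocol witnesses distillable entanglement at least 1/2.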

Theorem 1. Definitions 2 and 2' are equivalent for all classes $C$.

Proof. To show this, we need to know how $E_f(F, K)$ behaves for $K$ large. Although
we have defined $E_f(F, K)$ by minimizing over all states of fidelity $F$, it is clear by
symmetry and the convexity of $E_f$ that this minimum is attained by states of the
form

\[ a\,\Phi^+(K)\Phi^+(K)^\dagger + b; \]

we will call such a state an isotropic state of dimension $K$. The theorem, then,
follows from Lemma 1 below. □

Lemma 1. The entanglement of formation $E$ of an isotropic state of dimension
$K$ and fidelity $F$ satisfies

\[ F \log_2 K - H_2(F) \le E \le F \log_2 K, \]

where

\[ H_2(F) = -F \log_2 F - (1-F) \log_2 (1-F). \]

Proof. For the upper bound, write the state as a convex combination of the isotropic
state of fidelity 1 (with entanglement of formation $\log_2 K$) and the (separable)
isotropic state of fidelity $1/K$ (with entanglement of formation 0), and use the
convexity of the entanglement of formation to obtain an upper bound of

\[ \frac{FK - 1}{K - 1} \log_2 K = F \log_2 K - \frac{1 - F}{K - 1} \log_2 K \le F \log_2 K. \]

For the lower bound, we use the fact [6] that $E$ is bounded below by the
positive-partial-transpose bound on distillable entanglement. For isotropic states,
this bound was explicitly calculated to be

\[ \log_2 K + F \log_2 F + (1 - F) \log_2 (1 - F) - (1 - F) \log_2 (K - 1) \]
\[ = F \log_2 K - H_2(F) + (1 - F) \log_2 (K/(K - 1)) \]
\[ \ge F \log_2 K - H_2(F). \]

□
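
The chain of bounds in Lemma 1 and its proof can be spot-checked numerically. The following sketch evaluates the convexity upper bound and the p.p.t. lower bound exactly as written in the proof; the function names are hypothetical, and the range $1/K \le F \le 1$ with $K \ge 2$ is an assumption made here so that the convex combination in the proof makes sense.

```python
import math

def h2(f):
    """Binary entropy H_2(F), with the convention 0 * log 0 = 0."""
    if f <= 0.0 or f >= 1.0:
        return 0.0
    return -f * math.log2(f) - (1.0 - f) * math.log2(1.0 - f)

def convexity_upper_bound(f, k):
    """Upper bound (FK - 1)/(K - 1) * log2 K from the convex decomposition."""
    return (f * k - 1.0) / (k - 1.0) * math.log2(k)

def ppt_lower_bound(f, k):
    """Lower bound log2 K + F log2 F + (1-F) log2(1-F) - (1-F) log2(K-1)."""
    return math.log2(k) - h2(f) - (1.0 - f) * math.log2(k - 1.0)

def lemma1_bounds(f, k):
    """The pair (F log2 K - H_2(F), F log2 K) appearing in Lemma 1."""
    return f * math.log2(k) - h2(f), f * math.log2(k)

# Example: K = 4, F = 0.9 gives the ordering
# lemma lower <= p.p.t. bound <= convexity bound <= lemma upper.
lower, upper = lemma1_bounds(0.9, 4)
```

The p.p.t. bound and the convexity bound both sit inside the interval stated in the lemma, which is why the slightly weaker but simpler bounds suffice for Theorem 1.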

Another definition which has been proposed [9] replaces the fidelity condition by

\[ \inf_j F_{ij} \to 1. \]

Clearly, the distillable entanglement according to this definition lies between
the values according to definitions 1 and 2', so equivalence to our definitions follows
from theorem 3 below.



Basic protocols 

To show the remaining equivalences between the definitions, we will need some 
basic transformations of isotropic states. For instance, if we are given an isotropic 
state of dimension K, to what extent can we transform this into an isotropic state 
of dimension K' < K without significantly reducing the fidelity? We consider two 
protocols, both local and symmetric between "Alice" and "Bob" (the two subsys- 
tems). 

In the first protocol, Alice's portion of the protocol is to measure the subspace 
generated by the first K' basis elements. If Alice finds that her portion of the 
state is in that subspace, she does nothing; otherwise, she fails, i.e., replaces her 
portion of the state with a random element of that subspace. Bob performs the 
same protocol. 

If Alice and Bob are given an isotropic state of fidelity 1, it is easy to see that
this protocol produces an isotropic state of fidelity 1 if both Alice and Bob succeed
in their measurements (probability $K'/K$), and otherwise the protocol produces a
completely random state. On the other hand, on a completely random state, the
protocol will produce a completely random state. Thus the protocol must take the
state

\[ a\,\Phi^+(K)\Phi^+(K)^\dagger + \frac{1-a}{K^2} \]

to the state

\[ \frac{K'}{K} a\,\Phi^+(K')\Phi^+(K')^\dagger + \frac{1 - (K'/K)a}{K'^2}. \]

In other words, the state of fidelity $F$ is taken to the state of fidelity

\[ \frac{K'}{K} F + \frac{(K-K')\left((1-F)K'(K'+K) + K^2 - 1\right)}{K'^2 K (K^2 - 1)} \ge \frac{K'}{K} F. \]
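
The output fidelity of the first protocol can be cross-checked by computing it in two ways: through the weight $a$ of the maximally entangled component, and through the closed form in $F$. This is a minimal sketch under the assumption that the input is an isotropic state with $K \ge 2$, using the relation $F = a + (1-a)/K^2$; the function names are hypothetical.

```python
def weight_from_fidelity(f, k):
    """Solve F = a + (1 - a)/K^2 for the weight a of the entangled component."""
    return (f * k * k - 1.0) / (k * k - 1.0)

def protocol1_fidelity(f, k, k_new):
    """Output fidelity of protocol 1, tracked through the weight a.

    The entangled component survives with probability K'/K; everything
    else becomes the completely random state of dimension K'.
    """
    a_new = weight_from_fidelity(f, k) * k_new / k
    return a_new + (1.0 - a_new) / (k_new * k_new)

def protocol1_closed_form(f, k, k_new):
    """The same output fidelity, written as (K'/K) F plus a nonnegative term."""
    extra = ((k - k_new) * ((1.0 - f) * k_new * (k_new + k) + k * k - 1.0)
             / (k_new * k_new * k * (k * k - 1.0)))
    return (k_new / k) * f + extra

# Example: reducing from K = 4 to K' = 2 at fidelity 0.9.
f_out = protocol1_fidelity(0.9, 4, 2)
```

The two expressions agree, and the correction term vanishes exactly when $K' = K$, where the protocol does nothing.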

In the second protocol, we require that $K'$ be a factor of $K$. Both Alice and
Bob interpret their state space as a tensor product of spaces of dimension $K'$ and
$K/K'$, then trace away the space of dimension $K/K'$. Here a state of fidelity 1
maps to a state of fidelity 1, while a random state maps to a random state. Thus
the state

\[ a\,\Phi^+(K)\Phi^+(K)^\dagger + \frac{1-a}{K^2} \]

is taken to the state

\[ a\,\Phi^+(K')\Phi^+(K')^\dagger + \frac{1-a}{K'^2}, \]

or in other words, the state of fidelity $F$ is taken to the state of fidelity

\[ F + \frac{(1-F)(K^2 - K'^2)}{K'^2 (K^2 - 1)} \ge F. \]



Combining these protocols, we obtain the lemma 

Lemma 2. For any pair $K' < K$, there exists a local operation which, given an
isotropic state of dimension $K$ and fidelity $F$, produces an isotropic state of
dimension $K'$ and fidelity at least

\[ \frac{K'}{K} \left\lfloor \frac{K}{K'} \right\rfloor F \ge \frac{\max(K-K', K')}{K} F. \]

More generally, for any state of dimension $K$ and fidelity $F$, there exists a local
operation which produces a state of dimension $K'$ and fidelity as stated.
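
The bound in Lemma 2 comes from chaining the two protocols: first cut the dimension down to the largest multiple of $K'$ not exceeding $K$, then trace out the extra factor. The sketch below follows that recipe for isotropic inputs and compares the resulting fidelity with the stated bounds. It assumes $K \ge 2$ and $1 \le K' \le K$; all function names are hypothetical.

```python
def weight_from_fidelity(f, k):
    """Weight a of the maximally entangled component, from F = a + (1 - a)/K^2."""
    return (f * k * k - 1.0) / (k * k - 1.0)

def protocol1(f, k, k_new):
    """Projective cut to the first K' basis elements (protocol 1)."""
    a_new = weight_from_fidelity(f, k) * k_new / k
    return a_new + (1.0 - a_new) / (k_new * k_new)

def protocol2(f, k, k_new):
    """Partial trace over a factor of dimension K/K' (protocol 2)."""
    if k % k_new != 0:
        raise ValueError("protocol 2 requires K' to divide K")
    a = weight_from_fidelity(f, k)
    return a + (1.0 - a) / (k_new * k_new)

def reduce_dimension(f, k, k_new):
    """Protocol 1 down to K' * floor(K/K'), then protocol 2 down to K'."""
    k_mid = k_new * (k // k_new)
    return protocol2(protocol1(f, k, k_mid), k_mid, k_new)

def lemma2_bound(f, k, k_new):
    """The guaranteed fidelity (K'/K) * floor(K/K') * F."""
    return k_new * (k // k_new) * f / k

# Example: K = 10, K' = 4, F = 0.95 goes through the intermediate dimension 8.
f_reduced = reduce_dimension(0.95, 10, 4)
```

As the lemma suggests, the guarantee is strong when $K'/K$ is close to 0 or 1 and weakest near $K' \approx K/2$, where the floor term can lose almost half of the dimension.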

To be precise, we first use protocol 1 to reduce the dimension to $K' \lfloor K/K' \rfloor$ and
then use protocol 2 to reduce the rest of the way. For non-isotropic states, we note
that if we were to "twirl" [1] the state by a random operator of the form
$U \otimes \overline{U}$, we would get an isotropic state of the same fidelity. Since twirling is
not local (only 1-local), this is not quite enough. However, as in [1], one can then
argue that some choice of $U$ must obtain this fidelity, since the average $U$ does so, and
fidelity is linear. So for $K'/K$ close to either 0 or 1, we can reduce to dimension
$K'$ without significantly reducing the fidelity, via purely local operations. We do
not know what can be done in general for intermediate values of $K'/K$. (Locally,
that is; if one allows 1-local operations, one can simply teleport half of a maximally
entangled state of dimension $K'$ through the given state (M. and P. Horodecki,
personal communication).) However, what we have shown is enough to give

Theorem 2. Definitions 1 and 1' are equivalent for any class $C$ containing that
of local operations.

Proof. Clearly, any sequence of operations giving a lower bound on $D_C(\rho)$ according
to definition 1' also satisfies the conditions of definition 1. Suppose, therefore, that
we are given a sequence of operations satisfying the conditions of definition 1. We
need to show that there exists a sequence of operations of the same rate in which
the output always has dimension a power of 2.

Let $K_i$ be the sequence of output dimensions. Let $K'_i$ be defined for each $i$ to be
the largest power of 2 less than $K_i/n_i$. Then we observe the following:

\[ \lim_{i \to \infty} K'_i/K_i = 0. \]

In particular, applying lemma 2, we can produce a new sequence of operations with
output dimensions $K'_i$ and with output fidelities

\[ F'_i \ge \left(1 - \frac{K'_i}{K_i}\right) F_i \]

tending to 1. Since the $K'_i$ are powers of 2, and have the same limiting value of
$\frac{1}{n_i} \log_2 K'_i$ as $i \to \infty$, we are done. □

Similarly, in definition 2' we may assume that all output dimensions are powers
of 2; the only complication is that some $K_{ij}$ might be less than $n_i$, making $K'_{ij}$ less
than 1. This is simple to fix, however: if $K_{ij} < n_i$, take $K'_{ij} = 1$, making $F'_{ij} = 1$.

To show that definitions 1 and 2' are equivalent, we will need the following result:

Lemma 3. If $K$ is a power of 2, then the 1-locally distillable entanglement
(according to definition 1) $D_1(F, K)$ of the isotropic state of fidelity $F$ and dimension
$K$ satisfies

\[ D_1(F, K) \ge (2F - 1) \log_2 K - H_2(F) \ge (2F - 1) \log_2 K - 1. \]

Proof. The "hashing" protocol [1], as extended in [10], gives

\[ D_1(F, K) \ge \log_2 K + F \log_2 F + (1 - F) \log_2 ((1 - F)/(K^2 - 1)). \]

But

\[ \log_2 K + F \log_2 F + (1 - F) \log_2 ((1 - F)/(K^2 - 1)) \]
\[ = (2F - 1) \log_2 K - H_2(F) + (1 - F) \log_2 (K^2/(K^2 - 1)) \]
\[ \ge (2F - 1) \log_2 K - H_2(F). \]

□
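
The algebra in this proof can be verified numerically. The sketch below evaluates the hashing expression as written above and compares it with the two simpler bounds of Lemma 3; the function names are hypothetical, and $K \ge 2$ is assumed so that $K^2 - 1 > 0$.

```python
import math

def xlog2(x):
    """x * log2(x) with the convention 0 * log 0 = 0."""
    return 0.0 if x <= 0.0 else x * math.log2(x)

def h2(f):
    """Binary entropy H_2(F)."""
    return -xlog2(f) - xlog2(1.0 - f)

def hashing_bound(f, k):
    """log2 K + F log2 F + (1 - F) log2((1 - F)/(K^2 - 1))."""
    return (math.log2(k) + xlog2(f) + xlog2(1.0 - f)
            - (1.0 - f) * math.log2(k * k - 1.0))

def lemma3_bound(f, k):
    """(2F - 1) log2 K - H_2(F)."""
    return (2.0 * f - 1.0) * math.log2(k) - h2(f)

def lemma3_weak_bound(f, k):
    """(2F - 1) log2 K - 1, which uses only H_2(F) <= 1."""
    return (2.0 * f - 1.0) * math.log2(k) - 1.0

# Example: a 16-dimensional isotropic state of fidelity 0.95.
example = (hashing_bound(0.95, 16), lemma3_bound(0.95, 16), lemma3_weak_bound(0.95, 16))
```

The gap between the hashing expression and the first bound of the lemma is $(1-F)\log_2(K^2/(K^2-1))$, which is small for large $K$, so little is lost by using the simpler form.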



Remark. (1) Indeed, this is true when $K$ is a power of an arbitrary prime (H.
Barnum, D. DiVincenzo, personal communication), but we will not need this in the
sequel. It is not clear what happens for general $K$. (2) It is easy to verify that this
is true for a state of dimension 1, since then $F = 1$, and in that case the bound
says only that $D_1(1, 1) \ge 0$. (3) This result is true even if we insist in definition
1 that $n_i = i$. (4) It would be nice to have some bound of this sort be true using
only local operations; to be precise, if one could show that

\[ D_0(F, K) \ge (aF - (a - 1)) \log_2 K - o(\log_2 K) \]

as $K \to \infty$, for some constant $a$, then this would allow theorem 3 to be proved
using only local operations. Alternatively, if one could show that $D_0 = 0$ for all
impure states, the question of whether the definitions are equivalent given only
local operations would become moot.

The main theorem 

Theorem 3. If C contains the class of 1-local operations, then definitions 1, 1', 
2, and 2' are all equivalent. 

Proof. By theorems 1 and 2, it suffices to show that definitions 1 and 2' are
equivalent. Certainly, any sequence of operations satisfying the conditions of definition
1 will also satisfy the conditions of definition 2'. Suppose, therefore, that we are
given a sequence of operations $\mathcal{T}_i$ from $C$ satisfying the conditions of definition 2'.
Moreover, assume that each output dimension is a power of 2 (which we can do,
by the remark after theorem 2). Finally, by 1-local twirling, we may insist that the
output is always an isotropic state.

For each $i$, consider the operation $\mathcal{T}_i^{\otimes k}$ for large $k$. For any set of probabilities
$p'_{ij} < p_{ij}$, the probability that $\mathcal{T}_i^{\otimes k}$ produces at least $p'_{ij} k$ copies of output $j$
can be made arbitrarily close to 1 by taking $k$ sufficiently large. If we also choose
numbers $R'_{ij}$ with

\[ R'_{ij} < (2F_{ij} - 1) \log_2 K_{ij} - 1, \]

then lemma 3 tells us that, given $p'_{ij} k$ states of dimension $K_{ij}$ and fidelity $F_{ij}$,
we can produce, via 1-local operations, states of dimension
$\lfloor 2^{R'_{ij} p'_{ij} k} \rfloor$ with fidelity tending to 1 as $k$ tends to infinity.

We can thus use the following protocol. Apply $\mathcal{T}_i^{\otimes k}$ for sufficiently large $k$. If we
obtain at least $p'_{ij} k$ states of type $j$, then apply the hashing protocol to the states
of type $j$ for each $j$. This results in a state of constant dimension

\[ K'_i(k) = \prod_j \lfloor 2^{R'_{ij} p'_{ij} k} \rfloor \]

and fidelity tending to 1. On the other hand, if we do not obtain the desired numbers
of states of each type, simply produce a random state of dimension $K'_i(k)$. Since the
probability of this occurring can be made arbitrarily small, the resulting fidelity
still tends to 1 as $k$ tends to infinity. Thus we have a sequence of operations $O_k$
taking as input $n_i k$ copies of $\rho$ and producing as output a state of dimension $K'_i(k)$
with fidelity tending to 1. This already tells us that the $C$-distillable entanglement
of $\rho$ according to definition 1 is at least

\[ \frac{1}{n_i} \sum_j R'_{ij} p'_{ij}. \]

Since this is true for arbitrary $R'$ and $p'$ satisfying the above inequalities, we have

\[ D_C(\rho) \ge \frac{1}{n_i} \sum_j p_{ij} (2F_{ij} - 1) \log_2 K_{ij} - \frac{1}{n_i} \]

for each $i$. Letting $i$ tend to infinity, the theorem is proved. □
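
To see why this lower bound converges to the rate of definition 2', it helps to compare it with the two quantities of that definition for one fixed operation. The sketch below assumes the outcome probabilities sum to 1 and represents the outcomes as `(p, F, dim)` triples; this representation and the function names are illustrative choices, not part of the proof.

```python
import math

def theorem3_lower_bound(n, outcomes):
    """(1/n) * sum_j p_j (2 F_j - 1) log2 K_j - 1/n, assuming sum_j p_j = 1."""
    total = sum(p * (2.0 * f - 1.0) * math.log2(dim) for p, f, dim in outcomes)
    return (total - 1.0) / n

def rate(n, outcomes):
    """(1/n) * sum_j p_j log2 K_j, the rate in definition 2'."""
    return sum(p * math.log2(dim) for p, _, dim in outcomes) / n

def defect(n, outcomes):
    """(1/n) * sum_j p_j (1 - F_j) log2 K_j, the fidelity defect in definition 2'."""
    return sum(p * (1.0 - f) * math.log2(dim) for p, f, dim in outcomes) / n

# Hypothetical operation on n = 100 copies: a 60-qubit-pair output of
# fidelity 0.99 half the time, and failure (dimension 1) otherwise.
n_copies = 100
outcomes = [(0.5, 0.99, 2 ** 60), (0.5, 1.0, 1)]
bound = theorem3_lower_bound(n_copies, outcomes)
```

The bound equals the rate minus twice the defect minus $1/n_i$; since the defect and $1/n_i$ both tend to 0 under definition 2', the bound tends to the same limit as the rate.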

Remark. A similar argument shows that we did not err in our definitions in
allowing an arbitrary sequence of input sizes $n_i$. To be precise, for any given rate
$R$ less than the $C$-distillable entanglement, there is certainly some $i$ such that the
hashing protocol on the $i$th output achieves rate at least $R$ asymptotically. This
gives a sequence of operations with $n'_k = n_i k$. But then for any number of inputs not
in this sequence, one can simply discard inputs as necessary, without significantly
changing the rate. This gives a sequence of operations with $n'_i = i$ demonstrating
that $R \le D_C$.